Structured Bayesian Compression for Deep Neural Networks Based on the Turbo-VBI Approach

Authors

Abstract

With the growth of neural network size, model compression has attracted increasing interest in recent research. As one of the most common techniques, pruning has been studied for a long time. By exploiting the structured sparsity of the network, existing methods can prune neurons instead of individual weights. However, in these methods, the surviving neurons are randomly connected without any structure, and the non-zero weights within each neuron are also randomly distributed. Such an irregular sparse structure can cause very high control overhead and costly memory access in hardware, and can even increase computational complexity. In this paper, we propose a three-layer hierarchical prior to promote a more regular sparse structure during pruning. The proposed prior can achieve both per-neuron weight-level sparsity and neuron-level sparsity. We derive an efficient Turbo-variational Bayesian inferencing (Turbo-VBI) algorithm to solve the resulting compression problem with this prior. The Turbo-VBI algorithm has low complexity and supports more general priors than existing algorithms. Simulation results show that our approach yields more regularly structured pruned networks while achieving better performance in terms of compression rate and accuracy compared with the baselines.
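The abstract does not spell out the form of the three-layer hierarchical prior, so the following is only an orientation sketch of how such a prior is commonly factored in the Turbo-VBI literature: a support layer s, a precision layer rho, and a weight layer w. The specific distributions (Gaussian weights, Gamma precisions) and the hyper-parameters a, b, \bar{a}, \bar{b} are illustrative assumptions and may differ from the paper's actual choices.

p(\mathbf{w}, \boldsymbol{\rho}, \mathbf{s}) = p(\mathbf{s}) \, p(\boldsymbol{\rho} \mid \mathbf{s}) \, p(\mathbf{w} \mid \boldsymbol{\rho})
p(w_n \mid \rho_n) = \mathcal{N}\left(w_n; 0, \rho_n^{-1}\right)
p(\rho_n \mid s_n) = \mathrm{Gamma}(\rho_n; a, b)^{s_n} \, \mathrm{Gamma}(\rho_n; \bar{a}, \bar{b})^{1 - s_n}

Here s_n \in \{0, 1\} indicates whether weight w_n survives, and p(\mathbf{s}) encodes the structural preference, for example coupling the supports of all weights belonging to the same neuron so that whole neurons are kept or pruned together (neuron-level sparsity) while also regularizing the pattern of surviving weights inside each neuron (per-neuron weight-level sparsity).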


Similar Articles

Integrating Deep Neural Networks into Structured Classification Approach based on Weighted Finite-State Transducers

Recently, deep neural networks (DNNs) have been drawing the attention of speech researchers because of their capability for handling nonlinearity in speech feature vectors. On the other hand, speech recognition based on structured classification is also considered important since it realizes the direct classification of automatic speech recognition. For example, a structured classification meth...


Bayesian Incremental Learning for Deep Neural Networks

In industrial machine learning pipelines, data often arrive in parts. Particularly in the case of deep neural networks, it may be too expensive to train the model from scratch each time, so one would rather use a previously learned model and the new data to improve performance. However, deep neural networks are prone to getting stuck in a suboptimal solution when trained on only new data as com...


Compression of Deep Neural Networks on the Fly

Thanks to their state-of-the-art performance, deep neural networks are increasingly used for object recognition. To achieve the best results, they use millions of parameters to be trained. However, when targeting embedded applications the size of these models becomes problematic. As a consequence, their usage on smartphones or other resource-limited devices is prohibited. In this paper we intr...


Attention-Based Guided Structured Sparsity of Deep Neural Networks

Network pruning is aimed at imposing sparsity in a neural network architecture by increasing the portion of zero-valued weights for reducing its size regarding energy-efficiency considerations and increasing evaluation speed. In most of the conducted research efforts, the sparsity is enforced for network pruning without any attention to the internal network characteristics such as unbalanced outp...
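As a point of reference for the generic pruning described in this snippet, here is a minimal sketch of magnitude-based pruning; it is not the method of this paper or of the cited one, and the function name magnitude_prune and the 90% sparsity target are illustrative assumptions.

import numpy as np

def magnitude_prune(weights: np.ndarray, sparsity: float) -> np.ndarray:
    # Zero out the smallest-magnitude entries until roughly the requested
    # fraction of weights is removed; ties may remove slightly more.
    flat = np.abs(weights).ravel()
    k = int(sparsity * flat.size)
    if k == 0:
        return weights.copy()
    threshold = np.partition(flat, k - 1)[k - 1]
    mask = np.abs(weights) > threshold
    return weights * mask

W = np.random.randn(64, 128)
W_pruned = magnitude_prune(W, sparsity=0.9)  # about 90% of entries become zero

Note that the resulting zero pattern is irregular, which is exactly the issue the structured approaches discussed above try to avoid.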


Learning Structured Sparsity in Deep Neural Networks

High demand for computation resources severely hinders deployment of large-scale Deep Neural Networks (DNNs) in resource-constrained devices. In this work, we propose a Structured Sparsity Learning (SSL) method to regularize the structures (i.e., filters, channels, filter shapes, and layer depth) of DNNs. SSL can: (1) learn a compact structure from a bigger DNN to reduce computation cost; (2) ob...
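Since the snippet above describes regularizing whole structures (filters, channels, and so on), a short sketch of a filter-wise group-Lasso penalty may make the idea concrete. The function name filter_group_lasso, the PyTorch framework, and the coefficient lam are illustrative assumptions, not the exact formulation of the cited work.

import torch

def filter_group_lasso(conv_weight: torch.Tensor, lam: float = 1e-4) -> torch.Tensor:
    # conv_weight has shape (out_channels, in_channels, kH, kW).
    # Each output filter forms one group; summing the groups' L2 norms
    # pushes entire filters toward zero rather than individual weights.
    per_filter_norm = conv_weight.flatten(start_dim=1).norm(p=2, dim=1)
    return lam * per_filter_norm.sum()

# Usage: add the penalty to the task loss before back-propagation, e.g.
# loss = criterion(output, target) + filter_group_lasso(model.conv1.weight)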



Journal

Journal title: IEEE Transactions on Signal Processing

Year: 2023

ISSN: 1053-587X, 1941-0476

DOI: https://doi.org/10.1109/tsp.2023.3252165